Mon. Not. R. Astron. Soc. 000, 000-000 (0000) Printed 5 February 2008 (MN KTpX style file v2.2) 



A Robust Estimator of the Small-Scale Galaxy Correlation Function 



Nikhil Padmanabhan 1 ' 4 *, Martin White 2 , Daniel J. Eisenstein 3 

1 Physics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Rcl, Berkeley, CA 94720, USA. 

2 Department of Physics and Astronomy, 601 Campbell Hall, University of California Berkeley, CA 94720, USA. 

3 Steward Observatory, University of Arizona, 933 N. Cherry Ave, Tucson, AZ 85721, USA. 

4 Hubble Fellow, Chamberlain Fellow 



o 
o 



m 
(N 

> 

o 

(N 

O 

On 
i 

o 



13 



5 February 2008 



ABSTRACT 

We present a new estimator, u>, of the small scale galaxy correlation function that is robust 
against the effects of redshift space distortions and large scale structures. The estimator is a 
weighted integral of the redshift space or angular correlation function and is a convolution of 
the real space correlation function with a localized filter. This allows a direct comparison with 
theory, without modeling redshift space distortions and the large scale correlation function. 
This has a number of advantages over the more traditional w p estimator, including (i) an 
insensitivity to large scale structures and the details of the truncation of the line of sight 
integral, (ii) a well localized kernel in £(r), and (iii) being unbinned. We discuss how this 
estimator would be used in practice, applying it to a sample of mock galaxies selected from 
the Millennium simulation. 



1 INTRODUCTION 



The two point correlation function of galaxies is one of the fore- 
most probes of the physics of galaxy formation and evolution. Un- 
fortunately, peculiar velocities smear out the galaxy distribution 
along the line of sight, causing the observed 3D correlation func- 
tion, £ s (r), to deviate from the underlying isotropic correlation 
function, £(r). The most commonly employed s olution has been 
to co nsider the projected correlation function ( Davis & Peebles! 
1 19831) . 



w p (R) 



dZ £ s (R, Z) 



(1) 



where R and Z are the transverse and line of sight coordinates re- 
spectively. 

Although this estimator has enjoyed widespread u se (eg. see 
lHawkins et al.l2003MZehavi et all2005klCoil et alj200fj . for recent 
applications to the 2dFGRS,SDSS and DEEP2 redshift surveys), 
it has an important drawback. While the integral formally extends 
over the entire Z axis, it is truncated at Z max , where Z m&K 2> R is 
typically several tens of Mpc to avoid biases. This mixes in a large 
range of scales, strongly correlating measurements and making the 
estimator sensitive to possibly poorly measured long wavelength 
modes. Furthermore, the quantity of interest from the perspective 
of galaxy formation and evolution is £(r) where r is determined by 
the size of dark matter halos (~ 1 Mpc). Estimating this from w p 
requires disentangling it from large scale correlations. 

All of these problems can be solved by inverting the Abel in- 
tegral to obtain £(r), 



f(r) = -~ 



VR? 



dR dw v 



r 2 dR 



(2) 



Unfortunately, the resulting £(r) has severe anti-correlations be- 
tween bins, characteristic of all deconvolution problems. Further- 



more, these get worse as one reduces the bin sizes to avoid binning 
effects. To avoid these problems, it has become popular to attempt 
to theoretically model w p \ however, one is then left with the disad- 
vantages of w p . 

Motivated by this, we suggest a new statistic to that is a more 
robust estimate of the small and intermediate scale (~ 1 — 10 Mpc) 
correlation function. As the problematic modes in w p are slowly 
varying, we suggest high-pass filtering to remove them (Sec. I2.lt . 
Sec. l2.2l then describes a family of possible filters chosen to add the 
additional desirable properties of being compact in the projected 
correlation function and well localized in the 3D correlation func- 
tion. Sec. 12.31 describes how to estimate u) and demonstrates that 
it can be computed without any need to (arbitrarily) bin the data. 
Sec. |2.4| then tests these ideas with a sample of galaxies drawn from 
the Millennium simulation. We summarize in Sec. [3] 



2 A NEW CLUSTERING STATISTIC : u 
2.1 Theory 

If we truncate the integral in Eq.fJJat ±Z max , we introduce an error 



Awp(R) = 2 



dZ£, 3 (R,Z) 



(3) 



It is important to emphasize that the correlation function that ap- 
pears above is the redshift-space correlation function, and predict- 
ing it requires a complete model of redshift-space distortions, not 
just the isotropic correlation function. However, since for Z max ^> 
R, this error is a slowly varying function of R, one can reduce it by 
high-pass filtering, 



u(R e 



2tt / RdRG(R,R s )w p (R) 
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2vr / RdR G(R, R s ) 



dZ£ s (R,Z) (4) 



where the filter G(R, R s ) is designed to be compact, compensated 
(J R dR G(R) = 0) and have a characteristic scale R s . A compen- 
sated filter reduces the influence of slowly varying modes, coming 
from large r in f (r), while compactness avoids extrapolations of 
the data. We further assume G(R) has units of inverse volume, 
making u dimensionless. 

Using the above definitions, we can relate lu(R s ) to the real 
space correlation function £(r), 



uj(Rs) =4tt / drr£(r)W{r) 



where 



W(r) = - 
r 



RdR 



R? 



G(R) 



(5) 



(6) 



o 

Note that Eqs, l4l6l offer a natural generalization of Eq.[2] with a par- 
ticular "binning" that smooths over the anti-correlations in £ which 
come from inversion. 

In principle, one could generalize the above by considering a 
filter in R and Z, i.e. 



w = 27r / RdRdZG(R,Z,R a )i s (R,Z) 



(7) 



where G(R, Z, R s ) is chosen to maximize the S/N under the as- 
sumption of smoothness of the redshift distortions in Z. A sim- 
ple example would be to parameterize the k z dependence of the 
power spectrum by a low-order polynomial and extrapolate the fit to 
k z — 0. After much experimentation with basis choices, apodiza- 
tion and weighting we were unable to find an acceptable filter that 
converged to a real space statistic better than trivially weighting in 
Z, so we stick with our original unweighted integral. Note how- 
ever that one trivial generalization would be to allow Z max to vary 
with Rs . Whether or not this is useful will depend on details of the 
galaxy sample, and so we simply mention the possibility and defer 
experimentation to actual applications to real data. 

Finally, we observe that a similar formalism applies to the an- 
gular correlation function. If we consider a galaxy sample with red- 
shift distribution <f>, the angular correlation function w(8) is related 
to the 3D corre lation function by Limber's equation ( iLimbeJ 19531 ; 
|Peacock|[T99 9)n. 



(8) 



w W = J o d xtf{x) J dZ £ (j \e x f + z^j , 

where \ is me radial distance and J (f> d\ = 1- We now proceed as 
above defining 



uj(9 s ) = 2tt 



>G(e,e s ) w (9) 



(9) 



which probes scales around O s xo, with xo the characteristic dis- 
tance of the galaxy sample. As before, this is related to the 3D 
correlation function by Eq.[5]with W(r) replaced by Weir), 



We(r) 



d X 



W 



(10) 



1 Assuming that we are probing scales much less than the width of the 
redshift distribution. 
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Figure 1. Examples of G(R/R S ) chosen from the two parameter fam- 
ily described in the text, (o, /3)=(1,1) [dotted], (2,1) [short-dashed], (1,2) 
[long-dashed], and (2,2) [solid]. We adopt the (2,2) filter throughout this 
paper. The a;— axis is chosen such that the area under the curves is zero, 
while the filter normalization is arbitrary. Note that a controls the small-i? 
behaviour of the filter, while (3 determines its derivative at Ft ~ R s . 



where W is given by Eq.|6]with the integration variable interpreted 
as an angular coordinate. Note that this is the same window func- 
tion as for w p except that it is now smoothed by (f>. 



2.2 A Family of Filters 

A simple two parameter family of filters that satisfy the require- 
ments to be compact and compensated and that are analytically 
tractable are G(R) = R~ 3 G(x = R/R s ) witlfl 



G{x) 



"(1 



Y(c-z 2 ) 



and 



x < 1 
x > 1 



(11) 
(12) 



(13) 



where a, /3 = 1,2, 

a + 1 

C ~ a + [3 + 2 ' 

is determined by the requirement that the filter be compensated. 
The corresponding real space filters have the form 



W(y) 



= P(y) + 



P(y) 



-Q(y) 



y > i 



(14) 
(15) 



where P and Q are simple polynomials and y — r/R s . Table Q] 
lists the values for c, P(y) and Q(y) for the first four filters in this 
family, and Figs.[T]and |2]plot G(x) and W(y) for these filters. 

The choice of a and f3 determine the x — > and x — > 1 
behaviour of G(x). For small x, G(x) ~ x 2a ; increasing a re- 
duces the sensitivity of u) to measurements of the correlation func- 
tion on scales much smaller than the scales of interest. At the 
other end, /3 — 1 determines the number of derivatives of G(x) 
that vanish at x — 1. Increasing (3 increases the smoothness at 

2 For w(9) the prefactor is 0J 3 and x = 9/9 s . 
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5 


4 
1575 


126y 4 


- 288y 6 + ieOy s 




-4 
1575 


1 


f 5y 2 4 


- 42?/ 4 - 


208?/ 6 + 


160y 8 ] 


(2,2) 


1 

L> 


4 
3465 


231y 4 


- 792y 6 + 880y 8 


- 320y 10 ] 


-4 
3465 


1 


f 6j/ 2 4 


- 65J/ 4 - 


472j/ 6 + 


720y 8 - 320j/ 10 ] 



Table 1. Table of c, P(j/) and Q(y) for the first four niters in the simple two parameter family discussed in Sec. |2.2| 
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Figure 2. Corresponding VF(r/i? s ) curves for the filters shown in Fig. [T] 
The a;— axis is chosen to correspond to the appropriate measure in the in- 
tegral in Eq.|5] Note that the large r behaviour for all these filters is r~ 4 , 
corresponding to a singly compensated G filter. 



x = 1 and reduces susceptibility to ringing; however, large a and /3 
makes G(x) sharply peaked, increasing its susceptibility to noise in 
the correlation function. After some experimentation, we adopted 
(a, (5) = (2, 2) in what follows as a good compromise. Note that 
Eq.[l2]can be generalized to a iV— compensated filter 



(16) 



G N (x) = x 2a (l - x 2 ) Y\_( Ci ~ x2 ) ' 

i 

where the Ci are determined by the compensation conditions 

RdR R 2j G(R) = forO<j<iV . (17) 



For an iV-compensated filter W(r) ~ r _2 ( JV + 1 ) as r _» qq. How- 
ever, since the galaxy correlation function is falling off steeply with 
increasing radius, we find that a singly compensated filter is suffi- 
cient to remove the large radius contribution to u). For example, if 



uj(Rs) converges to 1% by r ~ 2R S . Furthermore, as 



an iV-compensated filter has N nodes, a multiply compensated fil- 
ter is rapidly oscillating and therefore sensitive to noise. For these 
reasons, we recommend using a singly compensated filter. Paren- 
thetically, we note that the polynomial representations of W(y) in 
Table[T]are not numerically ideal for j > 1. We therefore suggest 
switching to a series expansion in 1 /y for large y; for convenience, 
we give the expansion for the (2,2) filter here, 



W(y) = 



1 



+ 



+ 



3360y 4 4480y 6 32256y 



■ (18) 



As expected for a singly compensated filter, the leading term is of 
order y~ . 

The above discussion has focussed on a real space description; 
it is however instructive to relate uo to the 3D real space power 
spectrum, P(k). We recall that w p (R) is a Hankel transform of the 
power spectrum, 



w p (R) = ttR 



dk 2 Jo(kR) 

T A {k) ^R~ 



(19) 



where we have used the isotropy of the real-space power spectrum 
and A 2 (fc) = k 3 P(k)/2n 2 . Using Eq.g] we obtain 



^A 2 (k)W(k) 



where 



W(k) = 



27T 2 
k 



RdRG(R)Jo{kR) 



(20) 



(21) 



Although this integral can be done analytically for any polynomial 



G(R), we simply note that for kR s < 1, W(k) 



yet 



another manifestation of the insensitivity of u) to large scales (small 
k). In particular, W^O) = 0, implying that to is insensitive to the 
mean density of the sample and therefore, the integral constraint. 

We conclude this section with the exact analytic expression 
for u), assuming a power law correlation function, 



£(r) = 



for 7 > 1 . 



Integrating along the line of sight, one obtains 



w p (R) 



0Fr( 



-7 + 1 



(22) 



(23) 



Integrating along R, one obtains (for the (2,2) filter) 

= 27r 3 / 2 r(^-) 4( 7 -l) / />' 

F(i) (7- 7 )(9-t)(11-7)(13-7) Vn, 



(3 



•(24) 



Note that cu(R s ) oc £(R S ), allowing one to trivially translate be- 
tween £ and lo for the case of power law correlation functions. For 
7 ~ 2 the prefactor is a few per cent, so lo will be order unity on 
Mpc scales for modestly biased populations at z = (Fig.[3}. 

2.3 Computational Considerations 

The correlation function £ s (R, Z) can be estimated via 
DD(R,Z) 



(25) 



RR(R, Z) 

where DD(R, Z) is the number of galaxy pairs separated by R 
and Z, while RR(R, Z) is the analogous quantity for randomly 



© 0000 RAS, MNRAS 000, 000-000 



4 Padmanabhan et al. 



0.1 r 



0.01 r 




R (Mpc/h) 



Figure 3. Some examples of lu (assuming a (2,2) filter) : for £(r) = 
(r/lOlMpc/h])- 1 ' 5 [dotted], for f(r) = (r/3[Mpc//i])" 2 [dashed], 
and for a sample of galaxies (see text) from the Millennium simulation 
[solid]. 



distributed points. We assume that DD and RR are actual num- 
ber counts in a bin of a finite size, as opposed to density distribu- 
tions. Furthermore, although the number of random points is much 
greater than the number of data points no, we assume that RR is 
normalized to the number of data pairs, n 2 D . Our new statistic is 
simply an integral of the correlation function, 

rR s |-Z„„ 

lu(R s )=2tt dR dZ RG{R,R s )is{R,Z) . (26) 

Substituting Eq.[25]into the above (compare with Eq.HJl, we obtain 

uj{Rs)=2k J dR RG(R,R S ) J dZ ^r{r z) ' (27) 

where the constant piece vanishes due to the compensated nature of 
G(R), a manifestation of our insensitivity to the integral constraint. 

When estimating small-scale correlations, it is common to use 
small bins in R, which requires large random samples to com- 
pute RR accurately in each individual bin. This is particularly 
true because £ ^> 1 means that one needs random sets far larger 
than the data set to keep the shot noise in RR smaller than that 
in DD. Such RR summations easily dominate the computational 
time. This would be a serious problem for a naive implementation 
of Eq. [27] since the integral requires narrow bins in Z and R to 
avoid binning related errors. However, we can make an important 
simplification by noting that RR is a purely geometrical quantity 
that depends only on the survey geometry and the number of data 
points. We can therefore write it as0, 



RR(R, Z) = 2irR 2 n D n D $(R, Z) A In RAZ . 



(28) 



where no is the density of data points and the survey geometry is 
encoded in <&(R,Z). The advantage of this reformulation is that 



the form of $(R,Z) is highly constrained : < $(R,Z) < 1, 
&(R, Z) — > 1 as R, Z — *■ and it is both smooth and slowly 
varying for scales smaller that the survey size. This allows us to 
measure &(R, Z) by a simple Monte Carlo integration analogous 
to RR, but with significantly smaller random samples (for a given 
shot noise error), since we can smooth over many bins. For surveys 
where the angular and radial selection functions are separable, we 
can gain even further by separating &(R, Z) — 4>r(R)(/)z(Z), al- 
lowing us to project along each dimension and thus increase our 
statistics. 

The above allows us to consider arbitrarily fine bins in R and 
Z without the usual shot noise penalty in computing RR. We there- 
fore imagine a binning such that there is either zero or one DD pair 
per bin. Substituting this into Eq.|27] we can approximate the inte- 
gral by a Riemann sum; this transforms the integral into a sum over 
pairs in DD, 



C0(Rs) = 



where 



G{Ri 



^ n D n D &(Ri, Zi 



-H(Ri,Zi 



H(Ri,Zi) = e(Rs-Ri)G(Z n 



\Zi\) 



(29) 



(30) 



such that the Heaviside functions restrict the sum to pairs with R < 
R s and \Z\ < Z max - This avoids the need to ever bin the datfl 
Finally, we note that the above is trivially generalized to the Landy- 
Szalay dLandv & Szalavll 19931) correlation function estimator, 



iOLS = UDD/RR 

- 2 E 



G(Rj 



j£DR 



n D n D $(Rj,Zj) 



H(R iy Zi) , 



(31) 



where cu DD / rr is given by Eq.|29] and j runs over all data-random 
pairs. Note however that this second summation is computationally 
expensive for the reasons mentioned above. 



2.4 Testing the Filter 

In order to test the abov e ideas, we select a sample of 
galaxies (|C roton et al.| 120061) from the Millennium simulation 
dSpringel et al J 120051) with SDSS r band absolute magnitudes < 
—20.0 at z = 0, corresponding to approximately M, + 0.5 galax- 
ies with a space density of ~ 7.2 x 10~ 3 (/i/Mpc) 3 . This particu- 
lar sample of galaxies was chosen to minimize Poisson noise while 
approximating a typical sample of galaxies, but is otherwise arbi- 
trary. Assuming the distant observer approximation, we translate 
the galaxies to redshift space, and subdivide the simulation volume 
into fifty 100 x 100 x 250Mpc /h sub-volumes where the long axis 
corresponds to the redshift space direction. This yields 150 (using 
each of the coordinate axes as the redshift space direction in turn) 
samples of galaxies from which we measure lu, as discussed above. 

Fig. [4] shows the bias in w p and u (defined as bias = (/ — 
/rcf )//rcf where / = w p ,ui) for the above sample determined us- 
ing the estimator in Sec. 12.31 as a function of transverse scale and 
Z ma ,. The reference value is determined by projecting the redshift 
space correlation function to Z max = 100Mpc//i; using this as the 
reference mitigates the effects of sample variance and periodicity. 
The bias in ui is < 2% for ^ max > 40 Mpc//t; smaller scales are 



3 assuming certain mathematical niceties (eg. the survey geometry is not 4 We note as an aside that in a periodic simulation cube <E> = 1 and uj 

fractal) that are always true for real observations reduces to a sum over pairs weighted by G(Ri). 
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Figure 4. The fractional bias in ui (bias = (w — ui Ie { ) / cu re f ) obtained from 
the redshift space distribution of galaxies in the Millennium simulation for 
different Z max (in Mpc/h) - 20 (dotted), 40 (thick solid), 60 (dashed), and 
80 (dot-dashed). The true value (u> Te f) is assumed to be obtained by inte- 
grating to a Z max = 100Mpc/h; the figure shows that u> has converged 
to better than 2% by ^max — 40Mpc//i. The thin solid (red) line plots uj 
obtained by integrating over the real space correlation function. The fluctu- 
ations in Au> are consistent with measurement noise. For comparison, the 
points show the analogous bias terms for w p , with triangles, circles, and 
squares corresponding to Z max = 40, 60, and 80Mpc//i respectively. For 
a fair comparison with uj, we plot the bias in w p at R s /2, which roughly 
corresponds to the central scale probed by u). Note that the biases in w p are 
significantly larger than those for u. 



still affected by the nonlinear redshift space distortions. Further- 
more, we observe that our estimates of uj from the redshift space 
correlation function agree with a direct integration of the isotropic, 
real space correlation function. 

In contrast with the above, the estimates of w p are both biased 
at the 5-10 % level and show a significantly slower convergence 
to the true value. For instance, we find it necessary to integrate 
to Z max = 80Mpc//i to obtain biases < 5% (still significantly 
larger than the biases in uj) for R < 10Mpc//i. This serves to 
highlight the difficulties in measuring w p and the utility of uj as an 
alternate measure of galaxy clustering. Although the precise values 
of the bias will depend on the exact details of the galaxy sample, 
the qualitative aspects of the above is generically true. 



3 DISCUSSION 

We present an alternative, uj(R s ), to the commonly used w p (R) 
projected correlation function as a robust measure of the small 
scale (~ Mpc) galaxy correlation function. This is simply a fil- 
tered version of w p (R), and can be straightforwardly determined 
by a weighted sum of pairs in the data i.e. there is no reason to go 
through an intermediate step of estimating w p (R). The features of 
uj are : 



out to large scales (most often with the linear theory prediction). We 
reiterate that this is not true - the error in w p is determined by the 
redshift-space correlation function (Eq.[3j, and requires a model of 
redshift-space distortions on all scales. However, recall that it was 
our uncertainty in redshift-space distortions that led us to w p in the 
first place. 

In contrast, u converges to the real-space clustering statistic sig- 
nificantly faster than w v for similar transverse scales, making it in- 
sensitive to the precise value of Z max used. Seen in this context, uj 
completes the partial removal of redshift space information in w p , 
as originally intended. 

Furthermore, since uj converges significantly faster with Zmax, 
it is possible to truncate the underlying w p integral at a lower Z max 
than would have been naively possible, eliminating large scale 
noise and possibly reducing the errors in any downstream quan- 
tities derived from the data. The exact details of this are dependent 
on the exact details of the galaxy sample; we limit ourselves here 
to pointing out this possibility. 

(ii) Well localized in real space : The real space filter, W(r), is 
well localized in real space implying that uj(R s ) probes a relatively 
narrow range of scales around R s /2. 

(iii) Immune to the integral constraint : A corollary to the above 
is that uj(Rs) is immune to errors in the mean density, and therefore 
is insensitive to the integral constraint. 

(iv) Insensitivity to small scales : An appropriate choice of 
G(R) makes uj insensitive to measurements of clustering on very 
small scales. This is important as it is these scales that are the most 
sensitive to systematics in galaxy selection. 

(v) Unbinned : uj(R s ) is a naturally unbinned quantity, remov- 
ing any need for an arbitrary choice of binning. 

Finally, we point out that there is a natural generalization of uj for 
angular correlations. 
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AST-0407200. This document is LBNL Report LBNL-62026. 



REFERENCES 

Coil A. L., Newman J. A., Cooper M. C, Davis M., Faber S. M., 

Koo D. C, Willmer C. N. A., 2006, ApJ, 644, 671 
Croton D. J. et al., 2006, MNRAS, 365, 1 1 
Davis M., Peebles P. J. E., 1983, ApJ, 267, 465 
Hawkins E. et al., 2003, MNRAS, 346, 78 
Landy S. D., Szalay A. S., 1993, ApJ, 412, 64 
Limber D. N., 1953, ApJ, 117, 134 

Peacock J. A., 1999, Cosmological physics. Cambridge Univer- 
sity Press 

Springel V. et al., 2005, Nature, 435, 629 
Zehavi I. et al., 2005, ApJ, 621, 22 



(i) Improved Convergence (with Z maX/ ) to the real space cluster- 
ing statistic : It is tempting to believe that one can model the error in 
w p introduced by the Z max truncation, simply by specifying £(r) 
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